non-square VAE tiling #3

Merged
stduhpf merged 7 commits into stduhpf:bleedingedge from the vae_tiling_non_square branch on Jun 18, 2025

Conversation

wbruna commented Jun 15, 2025

Just an experiment to give us even more parameters to play with 🙂

I noticed sd_tiling didn't have anything forcing the tiles to be square, so I've extended SD_TILE_SIZE to support specifying tile sizes directly (like "32x48"), or as a floating-point factor applied to the image dimensions.

The factor thing is interesting, because it can be used to keep the number of tiles in both directions more or less constant, which makes a specific value fast for many different image ratios. For instance, around 15.2 (weird number because I just apply it as-is right now) is just enough to keep it at 4 iterations, which on my card is consistently faster than non-tiled VAE. In hindsight, it probably makes more sense to specify it directly as a number of tiles in each direction, though.
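As a rough illustration of the two forms described above, here is a minimal C++ sketch of how a value such as "32x48" could be told apart from a single floating-point factor. The names `TileSpec` and `parse_tile_spec` are hypothetical, and the rule that an 'x' selects the explicit form is an assumption; the actual parsing in this PR may differ.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical result type; not taken from the PR.
struct TileSpec {
    bool  is_factor;  // true: `factor` is valid; false: `width`/`height` are valid
    int   width;
    int   height;
    float factor;
};

// Parse either "WxH" (explicit tile size) or a single positive factor.
// ASSUMPTION: the presence of 'x' selects the explicit form.
std::optional<TileSpec> parse_tile_spec(const std::string& text) {
    const std::size_t sep = text.find('x');
    if (sep == std::string::npos) {
        char* end = nullptr;
        const float f = std::strtof(text.c_str(), &end);
        // Reject empty input, trailing garbage, and non-positive values.
        if (end == text.c_str() || *end != '\0' || !(f > 0.0f)) {
            return std::nullopt;
        }
        return TileSpec{true, 0, 0, f};
    }
    const std::string ws = text.substr(0, sep);
    const std::string hs = text.substr(sep + 1);
    char* wend = nullptr;
    char* hend = nullptr;
    const long w = std::strtol(ws.c_str(), &wend, 10);
    const long h = std::strtol(hs.c_str(), &hend, 10);
    if (wend == ws.c_str() || *wend != '\0' ||
        hend == hs.c_str() || *hend != '\0' || w <= 0 || h <= 0) {
        return std::nullopt;
    }
    return TileSpec{false, static_cast<int>(w), static_cast<int>(h), 0.0f};
}
```

Returning `std::nullopt` on malformed input leaves it to the caller to fall back to a default tile size.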

stduhpf (Owner) commented Jun 17, 2025

Overall nice work, I like the way you refactored the sd_tiling_calc_tiles() thing.

wbruna marked this pull request as draft June 17, 2025 20:01
wbruna force-pushed the vae_tiling_non_square branch from 7ab3d61 to 680140d June 17, 2025 21:25

wbruna (Author) commented Jun 17, 2025

I've fixed the latent/image confusion, and added support for two floating point factors for the env var.

About the factor: I don't think it makes sense to have less than 1 tile across a dimension (a tile larger than the dimension itself), so I'm treating a factor below 1 as "fraction of the latent dimension" and a factor above 1 as "how many tiles across the latent dimension", so we can try out both to see which one works best.
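The two readings of the factor can be sketched as follows. This is only an illustration: `tile_size_from_factor` is a hypothetical name, and the truncation and clamping are assumptions rather than the PR's actual behavior.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, not code from the PR.
// ASSUMPTIONS: plain multiplication or division with truncation, and the
// result is clamped so a tile is at least 1 and at most the full dimension.
int tile_size_from_factor(int latent_dim, float factor) {
    assert(latent_dim > 0 && factor > 0.0f);
    const float size = (factor < 1.0f)
        ? latent_dim * factor   // fraction of the latent dimension
        : latent_dim / factor;  // number of tiles across the latent dimension
    return std::clamp(static_cast<int>(size), 1, latent_dim);
}
```

For a latent dimension of 96, a factor of 0.5 yields a tile size of 48, and a factor of 4.0 yields 24.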

wbruna marked this pull request as ready for review June 17, 2025 21:47

stduhpf (Owner) commented Jun 17, 2025

> "how many tiles across the latent dimension"

That's not taking the overlap into account, right?

wbruna (Author) commented Jun 18, 2025

> That's not taking the overlap into account, right?

No, just a plain division right now.

stduhpf (Owner) commented Jun 18, 2025

The way to compute the tile size so there are only N tiles in one dimension, with an overlap factor of O would be to set the factor (multiplicative) to something like (1-O)/(N-O). If you want to add that, let me know, otherwise I will just merge this later today as it is.

wbruna (Author) commented Jun 18, 2025

Hm... Just let me check if I'm understanding how that overlapping factor works.

We want N tiles, of size T, to fill the length L. To avoid boundary issues, we start by placing a first tile T. If the overlapping factor means "which fraction of the tile is overlapping with its neighbor", each of the additional N-1 tiles will overlap T · overlap with the already placed tile, thus covering an additional area of T · (1-overlap). So we have:

L = T + (N-1)*T*(1-overlap)
T = L / ( 1 + (N-1)*(1-overlap) )
T = L / ( N - N*overlap + overlap )

For a 64 side with 3 tiles and overlap 0.5, this gives me a pretty reasonable tile size 32.

Am I on the right track here? 🙂
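The formula derived above can be checked with a few lines of C++. The helper name `tile_size_for_count` is hypothetical; only the expression itself comes from this comment.

```cpp
#include <cassert>

// Tile size T such that n_tiles tiles, each overlapping its neighbour by
// T * overlap, exactly cover `length`:
//     T = L / (1 + (N - 1) * (1 - overlap))
// Hypothetical helper name; not code from the PR.
double tile_size_for_count(double length, int n_tiles, double overlap) {
    assert(length > 0.0 && n_tiles >= 1 && overlap >= 0.0 && overlap < 1.0);
    return length / (1.0 + (n_tiles - 1) * (1.0 - overlap));
}
```

With a length of 64, 3 tiles and an overlap of 0.5, this returns 32, matching the figure quoted above.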

stduhpf (Owner) commented Jun 18, 2025

Yes that seems right. I'm not sure how I ended up with the (1-O)/(N-O).

wbruna (Author) commented Jun 18, 2025

It's mostly working. I had to request slightly different sizes, depending on the overlapping factor; for instance, for a 768x768 image with overlapping 0.5, a 4.0 factor gives me 4 tiles across; lowering the overlapping to 0.3, I get 5; but 3.9 gives me the expected 4... so it could be just rounding.

wbruna (Author) commented Jun 18, 2025

> It's mostly working. I had to request slightly different sizes, depending on the overlapping factor; for instance, for a 768x768 image with overlapping 0.5, a 4.0 factor gives me 4 tiles across; lowering the overlapping to 0.3, I get 5; but 3.9 gives me the expected 4... so it could be just rounding.

Indeed, adding std::round calls when multiplying by the factors seems to avoid that weirdness.
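To show why the conversion matters, here is a small sketch contrasting truncation with `std::round` when an integer dimension is scaled by a factor. Both helper names are hypothetical and the values 96 and 0.3 are illustrative only; this is not the code changed in the PR.

```cpp
#include <cmath>

// Hypothetical helpers contrasting the two conversions; not code from the PR.
int scaled_truncated(int dim, double factor) {
    return static_cast<int>(dim * factor);              // drops the fractional part
}

int scaled_rounded(int dim, double factor) {
    return static_cast<int>(std::round(dim * factor));  // nearest integer
}
```

For a dimension of 96 and a factor of 0.3, the product is about 28.8: truncation gives 28 while rounding gives 29, and a one-unit difference in tile size can be enough to change the resulting tile count.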

Comment on lines +619 to +623
    if ((overshoot_dim != non_tile_overlap) && (overshoot_dim <= num_tiles_dim * (tile_size / 2 - tile_overlap))) {
        // if tiles don't fit perfectly using the desired overlap
        // and there is enough room to squeeze an extra tile without overlap becoming >0.5
        num_tiles_dim++;
    }
stduhpf (Owner) commented Jun 18, 2025

@wbruna I think it's actually because of this.
If I remember correctly, I added this to make sure the overlap would preferably be bigger rather than smaller than the target (because less overlap tends to cause more noticeable transitions).
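The effect of squeezing in an extra tile can be illustrated with a simplified model, assuming the tiles are spread evenly so that they span the dimension exactly. `even_overlap` is a hypothetical helper and is not the tile-count logic under review.

```cpp
#include <cassert>

// Overlap (in the same units as `length`) between neighbouring tiles when
// n_tiles tiles of size tile_size are spread evenly to span exactly `length`.
// Hypothetical helper; a simplified model, not the PR's implementation.
double even_overlap(double length, double tile_size, int n_tiles) {
    assert(n_tiles >= 2 && tile_size > 0.0 && tile_size <= length);
    assert(n_tiles * tile_size >= length);  // tiles must be able to cover the length
    return (n_tiles * tile_size - length) / (n_tiles - 1);
}
```

For a length of 64 and a tile size of 32, three tiles overlap by 16 (half a tile), while four tiles overlap by about 21.3, so adding a tile errs on the side of more overlap rather than less.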

stduhpf (Owner) commented

If your fix with rounding doesn't work, removing these would work, though I think it's preferable to keep it.

stduhpf (Owner) commented Jun 18, 2025

Seems to be working fine so far, no matter the overlap.

stduhpf (Owner) commented Jun 18, 2025

All good now, I think.

stduhpf merged commit ac708e8 into stduhpf:bleedingedge on Jun 18, 2025
stduhpf pushed a commit that referenced this pull request Jun 18, 2025
* refactor tile number calculation

* support non-square tiles

* add env var to change tile overlap

* add safeguards and better error messages for SD_TILE_OVERLAP

* add safeguards and include overlapping factor for SD_TILE_SIZE

* avoid rounding issues when specifying SD_TILE_SIZE as a factor

* lower SD_TILE_OVERLAP limit

* zero-init empty output buffer